Post-training Quantization with Multiple Points: Mixed Precision without Mixed Precision

Authors

Abstract

We consider the post-training quantization problem, which discretizes the weights of pre-trained deep neural networks without re-training the model. We propose multipoint quantization, a quantization method that approximates a full-precision weight vector using a linear combination of multiple vectors of low-bit numbers; this is in contrast to typical quantization methods that approximate each weight using a single low-precision number. Computationally, we construct the multipoint quantization with an efficient greedy selection procedure, and adaptively decide the number of low-precision points on each quantized weight vector based on the error of its output. This allows us to achieve higher precision levels for important weights that greatly influence the outputs, yielding an "effect of mixed precision" without physical mixed-precision implementations (which require specialized hardware accelerators). Empirically, our method can be implemented by common operands, bringing almost no memory and computation overhead. We show that our method outperforms a range of state-of-the-art methods on ImageNet classification and that it can be generalized to more challenging tasks like PASCAL VOC object detection.
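The greedy construction described in the abstract can be sketched as follows. This is an illustrative NumPy sketch under simplifying assumptions (symmetric uniform quantization, a least-squares coefficient per point), not the authors' implementation; `quantize_vec`, `multipoint_quantize`, and all parameter choices are hypothetical:

```python
import numpy as np

def quantize_vec(v, bits=2):
    """Symmetric uniform quantization of a vector to the given bit-width."""
    levels = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(v))
    scale = max_abs / levels if max_abs > 0 else 1.0
    return np.round(v / scale) * scale

def multipoint_quantize(w, bits=2, max_points=4, tol=1e-3):
    """Greedily approximate w as a linear combination of low-bit vectors.

    Each step quantizes the current residual, attaches a least-squares
    coefficient, and stops once the relative residual norm drops below
    `tol` or `max_points` low-bit vectors have been used.
    """
    w = np.asarray(w, dtype=float)
    approx = np.zeros_like(w)
    points = []
    for _ in range(max_points):
        residual = w - approx
        q = quantize_vec(residual, bits)
        denom = q @ q
        if denom == 0.0:  # residual is exactly zero
            break
        alpha = (residual @ q) / denom  # least-squares coefficient
        approx = approx + alpha * q
        points.append((alpha, q))
        if np.linalg.norm(w - approx) <= tol * np.linalg.norm(w):
            break
    return approx, points
```

Adding more points monotonically tightens the approximation, which is how per-vector precision is adapted without hardware mixed-precision support.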


Related articles

Mixed Precision Training

Increasing the size of a neural network typically improves accuracy but also increases the memory and compute requirements for training the model. We introduce methodology for training deep neural networks using half-precision floating point numbers, without losing model accuracy or having to modify hyperparameters. This nearly halves memory requirements and, on recent GPUs, speeds up arithmeti...


Mixed-Precision Memcomputing

As the CMOS scaling laws break down because of technological limits, a radical departure from the processor-memory dichotomy is needed to circumvent the limitations of today’s computers. In-memory computing is a promising concept in which the physical attributes and state dynamics of nanoscale resistive memory devices organized in a computational memory unit are exploited to perform computation...


Mixed-Precision Vector Processors


Sound Mixed-Precision Optimization with Rewriting

Finite-precision arithmetic computations face an inherent tradeoff between accuracy and efficiency. The points in this tradeoff space are determined, among other factors, by different data types but also evaluation orders. To put it simply, the shorter a precision’s bit-length, the larger the roundoff error will be, but the faster the program will run. Similarly, the fewer arithmetic operations the prog...


Accelerating Scientific Computations with Mixed Precision Algorithms

a Department of Mathematics, University of Coimbra, Coimbra, Portugal b French National Institute for Research in Computer Science and Control, Lyon, France c Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN, USA d Oak Ridge National Laboratory, Oak Ridge, TN, USA e University of Manchester, Manchester, United Kingdom f Department of Mathematical an...



Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i10.17054